MongoDB Datasource

MongoDB can be used as a datasource through the MongoDB Connector for BI (the "BI Connector"). This allows MongoDB to be used:

  • As a datasource in the ETL for data mashups
  • As a 'pass-through' data source, where queries are run directly against the datasource using schemas and models built in Pyramid

Note: The BI Connector is available only with a MongoDB Enterprise license.

To use the BI Connector as a datasource in the ETL, connect to the database using the BI adapter. The BI Connector uses a probabilistic schema, inferred by sampling documents, to determine the best way to analyze the underlying data source. This presents the MongoDB database as a structured database: the objects appear as SQL tables, which are split into dimension tables and fact tables.
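To make the idea of a sampled schema concrete, here is a minimal Python sketch of how a set of documents could be mapped to table-like structures. It is an illustration only, not the BI Connector's actual algorithm: the function name infer_tables, the sample_size parameter, and the naming of child tables are all assumptions made for this example. Nested documents become dotted column names, and arrays are split into a child table that carries the parent _id as a join key.

```python
import random
from collections import defaultdict


def infer_tables(collection, documents, sample_size=100, seed=0):
    """Infer table-like structures from a random sample of documents.

    Hypothetical helper for illustration. Returns a dict of
    {table_name: {column_name: [observed type names]}}.
    """
    rng = random.Random(seed)
    docs = list(documents)
    # Only a sample is inspected, so rare fields can be missed entirely.
    sample = docs if len(docs) <= sample_size else rng.sample(docs, sample_size)

    tables = defaultdict(lambda: defaultdict(set))

    def walk(table, prefix, value):
        for key, item in value.items():
            column = prefix + key
            if isinstance(item, dict):
                # Nested document: flatten into dotted column names.
                walk(table, column + ".", item)
            elif isinstance(item, list):
                # Array: move its elements into a separate child table.
                child = f"{table}_{column}"
                for element in item:
                    if isinstance(element, dict):
                        walk(child, column + ".", element)
                    else:
                        tables[child][column].add(type(element).__name__)
            else:
                tables[table][column].add(type(item).__name__)

    for doc in sample:
        walk(collection, "", doc)

    # Child tables reuse the parent's _id so they can be joined back.
    for name, columns in list(tables.items()):
        if name != collection:
            columns["_id"] |= tables[collection]["_id"]

    return {
        name: {column: sorted(types) for column, types in columns.items()}
        for name, columns in tables.items()
    }
```

For a hypothetical "orders" collection whose documents contain an "items" array, this yields an "orders" table and an "orders_items" table. A column that reports more than one type (for example both int and float) is a hint that the inferred table may need cleaning before analysis.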

Key Challenges with MongoDB

  • The virtual table structure that the Mongo connector produces may need adjusting before it is analytically useful. At present, the best approach is to ingest the data into another database technology and transform it using the data flow tools.
  • The inferred relationships between the tables may be incorrect. This often happens with MongoDB structures when it's unclear which virtual tables are dimension tables and which are fact tables. In this scenario, it's best to add the relationships to the schema manually so that they properly reflect the data.
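Before adding a relationship manually, it can help to check which table behaves like a dimension and which like a fact. The Python sketch below shows one possible check on rows exported from two virtual tables. It assumes the rows are available as lists of dictionaries. The function describe_relationship and the field names in the usage note are hypothetical and are not part of Pyramid or the BI Connector.

```python
def describe_relationship(fact_rows, dim_rows, fact_key, dim_key):
    """Check whether dim_rows can act as a dimension for fact_rows.

    Hypothetical helper for illustration. A usable dimension key is
    unique, and every key in the fact rows should exist in the dimension.
    """
    dim_values = [row.get(dim_key) for row in dim_rows]
    dim_unique = len(dim_values) == len(set(dim_values))

    fact_values = {
        row[fact_key] for row in fact_rows if row.get(fact_key) is not None
    }
    # Keys used by the fact rows that have no matching dimension row.
    orphans = sorted(fact_values - set(dim_values))

    return {
        "dimension_key_unique": dim_unique,
        "orphan_keys": orphans,
        "cardinality": "many-to-one" if dim_unique else "many-to-many",
    }
```

For example, calling describe_relationship(orders, customers, "customer_id", "customer_id") on exported order and customer rows reports whether customer_id is unique in the customer table and lists any order keys with no matching customer. A many-to-many result, or a long list of orphan keys, suggests the tables are the wrong way round or that the join key is wrong.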